Shuaiqiang Wang

Not All Preferences Are Created Equal: Stability-Aware and Gradient-Efficient Alignment for Reasoning Models

Feb 01, 2026
Via arXiv

MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching

Jan 15, 2026
Via arXiv

Adversarial Yet Cooperative: Multi-Perspective Reasoning in Retrieved-Augmented Language Models

Jan 08, 2026
Via arXiv

Reinforced Efficient Reasoning via Semantically Diverse Exploration

Jan 08, 2026
Via arXiv

Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search

Jan 08, 2026
Via arXiv

DiffuGR: Generative Document Retrieval with Diffusion Language Models

Nov 19, 2025
Via arXiv

Efficient Thought Space Exploration through Strategic Intervention

Nov 13, 2025
Via arXiv

Thinking Forward and Backward: Multi-Objective Reinforcement Learning for Retrieval-Augmented Reasoning

Nov 13, 2025
Via arXiv

Can LLM Annotations Replace User Clicks for Learning to Rank?

Nov 10, 2025
Via arXiv

AdaSwitch: Adaptive Switching Generation for Knowledge Distillation

Oct 09, 2025
Via arXiv